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1 . Overview 

The main objective of this project has been to support the development of 
multigrid techniques in computational fluid dynamics that can achieve “textbook 
multigrid efficiency” (TME), which is several orders of magnitude faster than 
current industrial CFD solvers. 

Toward that goal we have assembled a detailed table which lists every foreseen 
kind of computational difficulty for achieving it, together with the possible ways 
for resolving the difficulty, their current state of development, and references. 

We have developed several codes to test and demonstrate, in the framework 
of simple model problems, several approaches for overcoming the most important 
of the listed difficulties that had not been resolved before. In particular, TME 
has been demonstrated for incompressible flows on one hand, and for near-sonic 
flows on the other hand. General approaches were advanced for the relaxation of 
stagnation points and boundary conditions under various situations. Also, new 
algebraic multigrid techniques were formed for treating unstructured grid formu- 
lations. More details on all these are given below. 

2. Table of Barriers 

During this period we have completed (in collaboration with D. Sidilkover, 
C. Swanson, J.L. Thomas and S. Taasan) a detailed table called “Barriers to 
Achieving Textbook Multigrid Efficiency in CFD” [Cl],[C3]. It lists every foreseen 
kind of computational difficulty for achieving that goal, together with the possible 
ways for resolving the difficulty, their current state of development, and references. 

Included in the table are staggered and nonstaggered, conservative and non- 
conservative discretization of viscous and inviscid, incompressible and compress- 
ible flows at various Mach numbers, as well as a simple (algebraic) turbulence 
model and comments on chemically reacting flows. The listed difficulties include: 
non-alignment of streamlines ©r sonic characteristics with grid directions; recircu- 
lating flows; stagnation points; discretization and relaxation on and near shocks 
and boundaries; far-field artificial boundary conditions; small-scale singularities 
(meaning important features, such as the complete airplane, which are not visible 
on some of the coarse grids); large grid aspect ratios; boundary layer resolution; 



and grid adaptation. 


3 . Solution methods on structured grids 

As shown before (see [Al], [A2] and [A3]), to obtain TME for any discretized 
partial differential system of equations (PDE), it is necessary and usually (with 
proper boundary treatment) also sufficient to attain that efficiency for each factor 
of the PDE principal determinant. Each such factor is a scalar differential operator 
of first or second order, so its efficient solution is a vastly simplified task. The way 
for separating the factors is by a distributed (and possibly also weighted ) relaxation 
scheme in which to each factor there corresponds a ghost discrete function. The 
latter can be directly relaxed for its corresponding factor, dictating a resulting 
pattern of changes to be distributed to the actual discrete functions (see details in 
Sec. 3.7 of [Al] and also in [A5]). To obtain the top efficiency, the relaxation of each 
ghost function should incorporate an essential part of an efficient multigrid solver 
for its corresponding operator: sometimes this essential part is just the relaxation 
part of that solver, sometimes this may even be the entire solver (applied at some 
proper subdomain). 

3.1 Incompressible systems 

For the incompressible Euler and Navier-Stokes equations, the relevant fac- 
tors are the Laplace and the convection (or convection-diffusion) operators. The 
former’s multigrid solver is classical; the latter’s can be based on downstream re- 
laxation [A3], with additional special procedures for recirculation flows [A4], [A6]. 
Indeed, it has been shown that incorporating such procedures into the relaxation 
schemes for the appropriate ghost functions yields very efficient solvers for in- 
compressible flows even at high Reynolds numbers and at second-order accuracy 

[A3]. 

In the first year of the NASA contract we have extended the results of [A3] to 
2D flows with more general ( grid cutting ) boundaries. Near such boundaries the 
interior DGS relaxation scheme is inefficient. Instead, we have used a very robust 
box relaxation (cf. [Al]). As anticipated, the “interior efficiency of the multigiid 
cycles could be attained, provided a couple of extra passes of this box scheme along 
the boundaries is added to the usual (interior) relaxation sweep. The amount of 
added computer work is negligible. The textbook efficiency of the FMG algorithm 
has been demonstrated for problems with a smooth bump in one of the side walls. 
See more details in Appendix A of [Dl]. 

3.2 The full-potential factor 

The main factor of flow systems for which no general adequate multigrid solver 
had previously been developed is the “ full potential operator 

( ud x + vdy + wdz)^ — a A , (1) 

where (u, v , w) is the flow velocity vector and a is the speed of sound. This 
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operator appears as a factor in the principal determinant of the 3-D compressible 
Euler equations. 

In the deep subsonic case (u 2 + v 2 + w 2 < .5a 2 , say) the operator (1) is 
uniformly elliptic, hence a usual multigrid V-cycle, employing red/black Gauss- 
Seidel relaxation at all levels, yields top-efficiency solvers. In the deep supersonic 
case ( u 2 +v 2 + w 2 > 1.5a 2 ) the full potential operator is uniformly hyperbolic (with 
the stream direction serving as the time-like direction), and an efficient solver can 
be obtained using downstream relaxation (marching in the time-like direction). 

The most difficult situation for solving the full potential operator is the near 
sonic regime (u 2 + v 2 + w 2 « a 2 ), especially in the (usual) case of non-alignment 
with the grid (e.g., when the grid is cartesian and no velocity component is con- 
sistently much smaller than the others). No previous multigrid approach had 
attained good efficiency in this case. 

Since (1) is just a factor of the Euler system principal determinant, its solution 
is only a (DGS) relaxation step in a multigrid Euler solver. It is therefore enough 
to confine this step to one subdomain at a time (whose size, however, is not O(h) 
but 0(1)). Without loss of generality we could therefore limit the algorithm to 
the case that throughout this subdomain the velocity is, e.g., vertically-inclined. 
In this case, the multigrid solver we have developed for (1) uses horizontal semi 
coarsening (coarsening only in the x and y directions), possibly together with 
vertical line relaxation. (This 2-line relaxation is actually not needed on the 
finest levels, but may be required after several levels of semi-coarsening.) With 
this semi coarsening, the inherent cross-characteristic numerical dissipation at the 
coarse level is smaller than at the fine one (opposite to their relation upon full 
coarsening); we can therefore stably add artificial dissipation terms at the coarse 
level so that its total cross-characteristic dissipation matches the local fine-level 
average. 

The resulting algorithm can fully exploit massively parallel processing. It can 
be extended to other non-elliptic operators, including the convection operator. 
(The approach mentioned in Sec. 3.1 for relaxing the convection operator, based 
on downstream relaxation, is not fully efficient on massively parallel machines.) 

Extensive numerical tests have been performed with this algorithm: first in 
2D, then in 3D, starting with constant-coefficients, then variable. Simple boundary 
conditions were chosen in a box: Dirichlet conditions on two opposite faces and 
periodic on the others. In 2D we have also carried out comprehensive half-space 
FMG mode analyses. All the results show that at any Mach number the algorithm 
always attains the “textbook” efficiency. The methods and results are summarized 
in several publications [C5], [C6], [C7], [CIO], [Cl 1]. 

4. New methods for unstructured grids 

The methods described above are suitable for structured grids or semi-struc- 
tured grids (i.e., uniform grids with overset patches of finer uniform grids creating 
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local refinements — which is the ideal setting for obtaining the highest efficiency 
of multigrid solvers). However, discretization schemes very common today in 
industrial CFD codes (e.g., in terms of finite elements or finite volumes) are highly 
unstructured. We have therefore decided to add to our program a special study 
on fast multigrid solvers for unstructured discretizations. 

Algebraic multigrid (AMG) algorithms are solvers of linear systems of equa- 
tions which are based on multigrid principles but do not explicitly use the ge- 
ometry of grids; see [C8], [C9], [C2], [C12], [C13]. The emphasis in AMG is on 
automatic procedures for coarsening the set of equations, relying exclusively on 
its algebraic relations. AMG is widely employed for solving discretized partial 
differential equations (PDEs) on unstructured grids (or even on structured grids 
when the coarse grid can no longer be structured, or when the PDE has highly dis- 
ordered coefficients). The scope of AMG solvers had been rather limited, though. 
Its coarsening procedures had been inadequate for general non-scalar, or high- 
order, or non-elliptic and anisotropic PDE systems, and also for non-variational 
discretizations. The purpose of our recent work [C4] has been to delineate general 
algebraic coarsening techniques that can be employed in all those cases and, in 
particular, be applicable to CFD. 

Two types of devices have been developed in [C4]. The first is a general 
criterion for gauging, and a method to control, the quality of the set of coarse- 
level variables, prior to deriving the coarse-level equations. The second includes 
general approaches for deriving the coarse-level equations once the coarse variables 
are given. 

Two such approaches have been advanced. The first one is based on the 
premise that although in principle each coarse variable depends on all others, 
this dependence decays exponentially with distance. Hence a highly accurate 
coarse equation can be constructed locally. This is done by solving a certain local 
optimization problem. We call this approach direct coarsening . 

The second approach is based on the traditional Galerkin coarsening , where 
the interpolation and restriction operators can again be very accurately derived 
by solving local optimization problems. 

In both these approaches one can control the level of coarsening accuracy, and 
the corresponding amount of computational work per coarse equation, by choosing 
the size of certain stencils: The error in approximating sufficiently “smooth” com- 
ponents (i.e., those slow to converge in relaxation) decreases exponentially with the 
size of those stencils, while the work per coarse equation increases proportionally 
to some power of that size. 

Numerical results for the advection equation are presented in [C4]. 

5. Studies with NASA scientists 

Collaborating with NASA /Langley researchers Drs. James L. Thomas, Thomas 
W. Roberts, and R. Charles Swanson, and with Boris Diskin at ICASE, we have 
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studied several particular issues of multigrid flow solvers, especially those related 

to boundary treatment and flow stagnation points. Resulting publications include 

[D2], [D3]. Some emerging general conclusions are summarized below. 

• When multigrid cycles slow down, a general useful practice is to print out, just, 
before the fine- to- coarse transition, a table of the normalized, residuals (i.e., the 
residuals for the discrete equations, each being normalized so that the sum of its 
coefficients in absolute value is 1). If the normalized residuals near boundaries 
are much larger than elsewhere, one can mend the cycles by one of two options: 
either to have more accurate residual transfers near the boundary, or to add 
extra local relaxation steps wherever needed to reduce the residuals to their 
overall average magnitude. The latter option is generally easier to implement, 
since the accurate fine-to-coarse-transfer weights sensitively depend on distances 
of various fine and coarse gridpoints to the boundary, and on the type of the 
boundary conditions. 

• The usual Distributed Gauss-Seidel (DGS) relaxation of the interior PDE system 
should in principle be extend only over those equations whose DGS relaxation 
does not touch the boundary conditions. 

• Near the boundary and on it, a simultaneous relaxation of interior equations 
and the boundary equations coupled to them should be performed. Two general 
approaches (for constructing such simultaneous relaxation steps when there is 
no natural set of discrete unknowns corresponding to each relaxed set of discrete 
equations) are the following. 

(A) A simultaneous Kacmarz scheme. (One can often just use simple Kac- 
marz for each of the boundary and near-boundary equations; but simultaneous 
schemes are more robust.) 

(B) A more specific distributive scheme; e.g., to each of the interior equations 
in the relaxed set assign the same distribution as in the interior DGS scheme, 
just avoiding distribution to variables participating in the discrete boundary 
conditions. 

• In the case that (due to aligned anisotropy of the equation or of the grid) 
the interior DGS needs be done simultaneously in lines (similarly: planes), 
one should solve simultaneously with each such line of interior equations the 
boundary equations which are coupled to them. 

• In the case of a stretched grid, which requires such line relaxation schemes, 
the interior DGS allows to relax each line of discrete equations resulting from 
the same PDE (e.g., the line of continuity equations) separately from all other 
equations — with proper coupling at the boundary as specified above. 

• On a boundary with radius of curvation comparable to the meshsize, the velocity 
error after relaxation is not smooth in the usual sense; instead, the tangential 
velocity error is smooth. Hence, the course- to- fine interpolation should be done 
in terms of the tangential velocity. In particular this is true near the leading 
and trailing edges of airfoils. 



• Near a stagnation point the relaxation scheme should be modified; one can 
no longer regard the advection coefficients as non-principal (i.e., neglect their 
imminent change when calculating a relaxation step). Thus, a full linearization 
of the advection terms should enter into the design of the relaxation steps near 
the stagnation point. Also, one cannot generally relax each momentum equation 
on the corresponding velocity alone. A general possible approach is to solve 
simultaneously all the nonlinear system in a box around the stagnation. 
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